1. INTRODUCTION

In physiology and anatomy, "allometric scalings" are 
empirical power laws among percentiles related to size. 
For example, the brain mass of mammals scales as the 
corresponding body mass to the power of about 0.7 [1]. One
of the most famous laws in this field is that between body 
mass and metabolic rate (i.e., the speed of metabolism), 
which has the scaling exponent 2/3 0. From the view- 
point of statistical physics, this nontrivial scaling relation 
is explained by the geometric structure of vessel networks 
and an assumption regarding minimum energy consump- 
tion 0. 

Recently, allometric scalings have been observed in the 
real world in various complex systems other than biolog- 
ical systems, and there are many attractive societal ap- 
plications; for example, economic indices as a function 
of urban population [J, Q energy consumptions vs urban 
population Q, surface area of roads vs that of cities 0, 
or economic indices vs national populations 

Fluctuations associated with these scalings in complex 
systems have also been studied. In these studies, the 
distribution of growth rates is one of the main topics, 
and the width of growth rates (e.g., the standard devi- 
ation or the interquartile distance) vs system size has 
been found to follow a power law with a negative expo- 
nent 0, In accordance with this scaling, the con- 
ditional distributions of growth rates normalized by the 
widths or the standard deviations conditioned by the system
size collapse onto universal curves, which are independent
of system size. Such conditional distributions
of growth rates have been reported for sales of business
firms [§, 11], national gross domestic products [12],
university research activities [13], citations to scientific
journals [14], the circulation of magazines and newspapers
[15], religious activities, bird populations [17],
and the metabolic rates of animals [10], etc. This characteristic
is also commonly observed between the metabolic
rates of animals and business firms [10].

Here, we focus on the statistical properties of business 
firms and regard each firm as a typical complex system 
consisting of various elements such as employees, facili- 
ties, and money. Firm activity, in the form of financial 
reports, is rendered numerically observable. The data 
within typical financial reports contains many quantities 
relating to firm size, which we can roughly categorize into 
three families: 

1. Flow variables; such as annual sales, profit, in- 
comes, or tax payments. 

2. Stock variables; such as the number of employees, 
number of branches, or number of factories. 

3. Business relations; such as the number of business 
partners or number of affiliated firms. 

Quite interestingly, the one-body statistics based on
these quantities are generally approximated by a power
law distribution that is typically independent of country 
and observation year; namely, the universal Zipf law for 
annual sales or profits [ill, EM Ell • There have been many 
attempts, typically based on mathematical toy models 
based on stochastic scale-free dynamics, to clarify why 
such a power law should hold for a one-body distribution 

A few pioneering works exist on allometric scaling of
business firms. For example, Fujiwara et al. reported
that employee numbers and incomes scale with the corresponding
universal conditional distribution for Japanese
business firms (up to intermediate size) [21]. Watanabe
et al. have also analyzed these financial scalings by using
the production function [22], and Saito et al. have shown
a scaling relationship between numbers of business partners
and annual sales [23].

In this study we analyze two- and three-body statistics
of typical business variables from the three data categories:
annual sales, number of employees, and number
of business partners. In particular, we focus on the relation
between the scalings among the three quantities
and the fluctuations associated with them. By analyzing
data from about 500,000 Japanese firms, we find in Sec.
2 that some pairs of these quantities follow power laws. In
addition, we show that the distribution functions for different
parameters converge to a unique scaling function
through these scaling relations of conditional medians. In
the same section, we also find, for three-body relations,
scalings of the conditional median of sales and employees
as a function of the other two variables. In Sec. 3, we
introduce simple stochastic models that reproduce all the
empirical scalings and discuss the relations between these
scalings and fluctuations. Finally, we conclude with a
discussion in Sec. 4.



2. DATA ANALYSIS 

The data set was provided by the governmental research
institute RIETI (Research Institute of Economy,
Trade and Industry) and was based on data collected by
Tokyo Shoko Research, Ltd. (TSR) for 2005. It contains
approximately one million firms, covering practically all
active firms in Japan. For each firm, the data set contains
various flow variables, stock variables, and a list
of business partners categorized into suppliers and customers
[24]. From this list, we count the total number
of business partners by superposing all business interactions.
We focus on the three basic scale indicators of
firms from the three categories: sales s, number of employees
l, and number of business partners, which we call
the degree k. We neglect those firms for which the three
data are not available, so that the number of firms we
analyze is 529,291.



2.1. Correlations between two variables 

In general, all information regarding three-body statistics
for stochastic variables {X, Y, Z} is contained in
the three-body probability density function (PDF),
P(X, Y, Z). To clarify the structure of this function,
we use the definition of the conditional probability
to decompose it into three density functions as
P(X, Y, Z) = P(X|Y, Z) P(Y|Z) P(Z), where P(Y|Z) denotes
the conditional probability density of Y for a given value
of Z, and P(X|Y, Z) is the conditional
probability density of X for simultaneously given values
of Y and Z. We pay attention to the properties of
these conditional probability densities. First, we
observe the probability densities conditioned by
one variable, P(Y|Z), and then the probability densities
conditioned by two variables, P(X|Y, Z).

We begin by analyzing the two-body relations between
the number of employees l and degree k. Fig. 1(a) shows
the log-log plot of the number of employees as a function
of degree k. We find that all such plots have similar
forms for the 5th, 25th, 50th (equivalent to the median),
75th, and 95th percentiles of the number of employees l
for a given degree k. In Fig. 1(b), we shift these plots
along the vertical axis so that they all lie on the median
plot at k = 100. All these conditional percentile curves
essentially coincide with each other. In particular, for
k > 30, this relation can be described by the following
scaling relation:

$$\langle l|k \rangle_q = B_q^{(l|k)} \cdot k^{\gamma_{l|k}} \quad (q = 0.05, 0.25, 0.5, 0.75, 0.95), \tag{1}$$

where $\gamma_{l|k} = 1.0$, $\langle l|k \rangle_q$ is the $100q$-th conditional
percentile of l given k, and $B_q^{(l|k)}$ is a proportionality constant
for percentile $100q$. The values of $B_q^{(l|k)}$ are estimated to
be 0.3 for the 5th percentile, 0.7 for the 25th percentile,
1.6 for the 50th percentile, 4.0 for the 75th percentile and
12 for the 95th percentile. $B_q^{(l|k)}$ can be interpreted as
the number of employees per business partner. We find
the typical value at the median is 1.6. According to these
percentile scaling relations, the PDF of l for a given value
of k, P(l|k), is



$$P(l|k) = \frac{1}{f_1(k)} \Psi_1\!\left(\frac{l}{f_1(k)}\right), \tag{2}$$

where $f_1(k) = \langle l|k \rangle_{0.5}$ is the scaling function and $\Psi_1$
is the PDF of the normalized quantity, $\tilde{l}_k = l/f_1(k)$,
which does not depend on k. Note that, because the
lower limit of the number of employees is 1, the PDF has
a cutoff for small k. In Fig. 1(c), we plot the conditional
PDF P(l|k) for several values of k, which shifts
to the right with increasing degree k. In Fig. 1(d), we see that
the plot of $\Psi_1(\tilde{l}_k)$ indeed does not depend on the value of
k.
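The percentile analysis behind Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses synthetic data (degrees drawn from a Pareto distribution, employees equal to degree times a lognormal factor) instead of the TSR data, and the helper names `conditional_percentiles` and `loglog_slope`, the bin width, and the minimum bin count are choices made for this example, not part of the original analysis.

```python
import math
import random
from collections import defaultdict

QS = (0.05, 0.25, 0.5, 0.75, 0.95)

def conditional_percentiles(xs, ys, qs=QS, bins_per_decade=4, min_count=30):
    """Percentiles of y inside logarithmic bins of x.

    Returns {bin_center: [percentile for each q]}; sparse bins are skipped.
    """
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[math.floor(math.log10(x) * bins_per_decade)].append(y)
    out = {}
    for b in sorted(groups):
        vals = sorted(groups[b])
        if len(vals) < min_count:
            continue
        center = 10 ** ((b + 0.5) / bins_per_decade)
        out[center] = [vals[min(len(vals) - 1, int(q * len(vals)))] for q in qs]
    return out

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

# Synthetic stand-in for the firm data (assumption: NOT the TSR data set).
rng = random.Random(0)
k = [3.0 * rng.paretovariate(1.3) for _ in range(100_000)]
l = [ki * rng.lognormvariate(0.72, 1.2) for ki in k]

table = conditional_percentiles(k, l)
centers = [c for c in table if 10 <= c <= 1000]

# One fitted exponent per percentile curve; all should be close to 1 here.
slopes = [loglog_slope(centers, [table[c][i] for c in centers])
          for i in range(len(QS))]

# Collapse check: the 95th percentile divided by the conditional median
# should be roughly independent of k if the scaling form of Eq. (2) holds.
ratios_95 = [table[c][4] / table[c][2] for c in centers]
```

Fitting each percentile curve separately, as done here, is one simple way to check that all percentiles share a common exponent; shifting the curves onto the median, as in Fig. 1(b), conveys the same information graphically.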

We apply a parallel analysis to the relation between sales
s and the degree k. From Figs. 2(a) and (b), we can
confirm that, over the whole range of k, all of these conditional
percentile curves essentially coincide with each other after
shifting them along the vertical axis. For k ranging
from 30 to 1000, the following nontrivial scaling relation
holds for the conditional percentile values $\langle s|k \rangle_q$:



$$\langle s|k \rangle_q = B_q^{(s|k)} \cdot k^{\gamma_{s|k}} \quad (q = 0.05, 0.25, 0.5, 0.75, 0.95), \tag{3}$$

where $\gamma_{s|k} = 1.3$, $B_{0.05}^{(s|k)} = 3.8 \cdot 10^6$ (yen), $B_{0.25}^{(s|k)} = 14 \cdot 10^6$
(yen), $B_{0.5}^{(s|k)} = 32 \cdot 10^6$ (yen), $B_{0.75}^{(s|k)} = 60 \cdot 10^6$ (yen) and
$B_{0.95}^{(s|k)} = 190 \cdot 10^6$ (yen). Note that this scaling exponent
value, $\gamma_{s|k} = 1.3$, differs significantly from that for the
employees, $\gamma_{l|k} = 1.0$. This result implies that the mean
of "sales per degree" increases with increasing number
of business partners. The conditional PDFs of sales for
different k, P(s|k), are plotted in Fig. 2(c). This function
is expected to be expressed by a scaling function as



$$P(s|k) = \frac{1}{f_2(k)} \Psi_2\!\left(\frac{s}{f_2(k)}\right), \tag{4}$$

where $f_2(k) = \langle s|k \rangle_{0.5}$ is the scaling between s and k
at the median point. The PDF of sales normalized by
FIG. 1: Scaling relations for number of employees conditioned
by degree. (a) Conditional percentiles of the number
of employees l given degree k. The data shown are the 5th
percentile (black triangles), 25th percentile (red plus signs),
50th percentile (green nablas), 75th percentile (blue squares),
and 95th percentile (purple crosses). (b) Corresponding percentiles
obtained by shifting the plots in panel (a) along the
vertical axis so that they overlap with the median plot at
k = 100. The black dashed-dotted line shows the slope
of $k^{1.0}$. (c) Conditional PDF of number of employees l for
given degree k, P(l|k), where the conditional parameter k
was evenly divided in logarithmic space into eight boxes.
(d) Conditional PDF of the normalized number of employees
$\tilde{l}_k = l/\langle l|k \rangle_{0.5}$ given degree k.
FIG. 2: Scaling relations of sales conditioned by degree.
(a) Conditional percentiles of sales s given degree k. The
data shown are the 5th percentile (black triangles), 25th percentile
(red plus signs), 50th percentile (green nablas), 75th percentile
(blue squares), and 95th percentile (purple crosses).
(b) Corresponding percentiles obtained by shifting the plots in
panel (a) along the vertical axis so that they overlap with the
median plot at k = 100. The black dashed-dotted line shows
a slope of $k^{1.3}$. (c) Conditional PDF of sales s for given degree
k, P(s|k), where the conditional parameter k was evenly
divided into eight boxes in logarithmic space. (d) Conditional
PDF of normalized sales $\tilde{s}_k = s/\langle s|k \rangle_{0.5}$ given degree k.
using this scaling, $\tilde{s}_k = s/f_2(k)$, is plotted in Fig.
2(d), and we confirm that $\Psi_2$ is independent of degree k.

Finally, we investigate the relation between sales and
number of employees by an analysis parallel to that
of the other pairs. As shown in Figs. 3(a) and 3(b), we
obtain the following scaling relation between sales and
number of employees:



$$\langle s|l \rangle_q = B_q^{(s|l)} \cdot l^{\gamma_{s|l}} \quad (q = 0.05, 0.25, 0.5, 0.75, 0.95), \tag{5}$$

where $\langle s|l \rangle_q$ denotes the $100q$-th percentile of sales for a given
number of employees l, $\gamma_{s|l} = 1.3$, $B_{0.05}^{(s|l)} = 5.5 \cdot 10^5$
(yen), $B_{0.25}^{(s|l)} = 22 \cdot 10^5$ (yen), $B_{0.5}^{(s|l)} = 47 \cdot 10^5$ (yen),
$B_{0.75}^{(s|l)} = 80 \cdot 10^5$ (yen) and $B_{0.95}^{(s|l)} = 19 \cdot 10^6$ (yen). The
conditional PDF of sales, P(s|l), is plotted in Fig. 3(c)
and the corresponding PDF of the normalized variable,
$\Psi_3(s/f_3(l))$, is plotted in Fig. 3(d). The scaling
function $\Psi_3$ is defined by



$$P(s|l) = \frac{1}{f_3(l)} \Psi_3\!\left(\frac{s}{f_3(l)}\right), \tag{6}$$

where $f_3(l) = \langle s|l \rangle_{0.5}$. The results shown in Fig. 3(d)
demonstrate that all conditional PDFs collapse onto a
single function as expected. This scaling relation agrees
with Eqs © and Eqs (gj 

Integrating the conditional PDFs over the conditioning variables, we obtain the
PDF of each single variable.
For each variable, k, l and s, the PDF is plotted in Fig. 4
(black solid lines) on a log-log scale. We have the following
power laws:

$$P(k) \propto k^{-\zeta_k - 1}; \quad P(l) \propto l^{-\zeta_l - 1}; \quad P(s) \propto s^{-\zeta_s - 1}, \tag{7}$$

where $\zeta_k = 1.3$, $\zeta_l = 1.3$ and $\zeta_s = 1.0$. These exponents
are directly related to the scaling exponents, as shown
below.

Assuming that X obeys the following power-law distribution
with the PDF

$$P_X(X) \propto X^{-\zeta_X - 1}, \tag{8}$$

and also assuming that X and Y satisfy the allometric
scaling relation

$$Y \propto X^{\gamma_{Y|X}}, \tag{9}$$

where $\gamma_{Y|X}$ is the scaling exponent, then by a simple
variable transformation, the PDF of Y is given as

$$P_Y(Y) = P_X(X) \left| \frac{dX}{dY} \right| \propto Y^{-\zeta_X/\gamma_{Y|X} - 1}. \tag{10}$$

Thus, we get the following relation between the power
law indices:

$$\gamma_{Y|X} = \zeta_X/\zeta_Y. \tag{11}$$

This relation is confirmed in our data analysis: $\gamma_{l|k} =
\zeta_k/\zeta_l = 1.0$, $\gamma_{s|k} = \zeta_k/\zeta_s = 1.3$ and $\gamma_{s|l} = \zeta_l/\zeta_s = 1.3$.
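The relation of Eq. (11) can be checked numerically with a short sketch. The code below is an illustration under assumptions, not the procedure used for the empirical data: it draws X from an exact Pareto distribution, applies a deterministic power-law map, and estimates both tail exponents with the Hill estimator (a standard estimator that we substitute here for the graphical fits of Fig. 4). The function name `hill_exponent` and the 5% tail fraction are choices made for this example.

```python
import math
import random

def hill_exponent(values, tail_fraction=0.05):
    """Hill estimate of zeta for a tail P(x) ~ x^(-zeta-1)."""
    ordered = sorted(values, reverse=True)
    n_tail = max(2, int(len(ordered) * tail_fraction))
    threshold = ordered[n_tail]          # the (n_tail + 1)-th largest value
    log_excess = sum(math.log(x / threshold) for x in ordered[:n_tail])
    return n_tail / log_excess

rng = random.Random(1)
zeta_x, gamma = 1.3, 1.3                 # illustrative values only
x = [rng.paretovariate(zeta_x) for _ in range(50_000)]
y = [xi ** gamma for xi in x]            # deterministic allometric map, Eq. (9)

zeta_x_hat = hill_exponent(x)
zeta_y_hat = hill_exponent(y)            # expected: zeta_x_hat / gamma
```

Because the map from X to Y is monotone, the two Hill estimates differ exactly by the factor $\gamma_{Y|X}$, which is Eq. (11) in estimator form.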
FIG. 3: Scaling relations of sales conditioned by number of
employees. (a) Conditional percentiles of sales s for a given
number of employees l. The data shown are the 5th percentile
(black triangles), 25th percentile (red plus signs), 50th percentile
(green nablas), 75th percentile (blue squares), and
95th percentile (purple crosses). (b) Corresponding percentiles
obtained by shifting these plots along the vertical
axis so that they overlap with the median curve at l = 100. The black
dashed-dotted line has a slope of $l^{1.3}$. (c) Conditional PDF of
sales s for a given number of employees l, P(s|l), where the
conditional parameter l is evenly divided in logarithmic
space into eight boxes. (d) Conditional PDF of normalized
sales $\tilde{s}_l = s/\langle s|l \rangle_{0.5}$ for a given number of employees l.
FIG. 4: PDFs of degree k, number of employees l, and sales
s for empirical data (black solid lines), shuffled model (red
dashed line), and lognormal distribution model (green dash-dotted
line). (a) PDFs of degree k. The black dashed line
shows $k^{-2.3}$. (b) PDFs of number of employees l. The black dashed line
shows $l^{-2.3}$. (c) PDFs of sales s. The black dashed support
line shows $s^{-2.0}$. This figure confirms that sales s obey Zipf's
law.
FIG. 5: (a) Conditional median of sales s given degree k
and number of employees l, $\langle s|k,l \rangle_{0.5}$. The contour lines
provide the sales s (1000 yen) in common logarithm (e.g.,
"5" in the figure means $10^8$ yen). The red dashed line is
$l \propto k^{-0.44}$. (b) Conditional median of number of employees l
given values of degree k and sales s, $\langle l|k,s \rangle_{0.5}$. The contour
lines provide l in common logarithm. For example, "3" in the
figure means $l = 10^3$. (c) Conditional median of degree k
given number of employees l and sales s, $\langle k|l,s \rangle_{0.5}$. The
contour lines give k in common logarithm.



2.2. Correlations among three variables 

In this subsection, we investigate the dependence of a
given variable on the other two. Here we plot only the median
values of X to characterize the conditional probability
density P(X|Y, Z) because the number of observable
samples is not sufficiently large when conditioning on two
variables Y and Z. Although the mean value is another
candidate for characterizing the probability density, the
median value is much more robust against outliers than the
mean value; we therefore use the median value because the
data we are analyzing include outliers.
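As a rough illustration of this choice, the following sketch computes the conditional median (and, for comparison, the mean) of one variable on a logarithmic grid of the other two. The toy numbers and the function name `conditional_median_grid` are hypothetical and serve only to show how a single outlier moves the mean of a cell but not its median.

```python
import math
import statistics
from collections import defaultdict

def conditional_median_grid(xs, ys, zs, bins_per_decade=1):
    """Median and mean of x in each logarithmic (y, z) cell.

    Returns {(y_bin, z_bin): (median, mean, count)}.
    """
    cells = defaultdict(list)
    for x, y, z in zip(xs, ys, zs):
        key = (math.floor(math.log10(y) * bins_per_decade),
               math.floor(math.log10(z) * bins_per_decade))
        cells[key].append(x)
    return {key: (statistics.median(v), statistics.fmean(v), len(v))
            for key, v in cells.items()}

# Toy data (hypothetical numbers): the first cell contains one extreme outlier.
sales     = [1.0, 2.0, 100.0, 5.0, 6.0, 7.0]
degree    = [1, 2, 3, 10, 20, 30]
employees = [1, 1, 2, 5, 5, 5]

grid = conditional_median_grid(sales, degree, employees)
# Cell (0, 0) holds sales 1, 2 and 100: its median stays at 2,
# whereas its mean is pulled up to about 34 by the outlier.
```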

Fig. 5 shows contour plots of $\langle s|k,l \rangle_{0.5}$,
$\langle l|k,s \rangle_{0.5}$, and $\langle k|l,s \rangle_{0.5}$, where $\langle X|Y,Z \rangle_{0.5}$ is
FIG. 6: (a) Dependence of $\langle s|k,l \rangle_{0.5}$ on degree k, regarding l as a fixed parameter, for 1 < l < 2 (black triangles), 10 < l < 20
(red plus signs), 100 < l < 200 (green nablas) and 1000 < l < 2000 (blue diamonds). All supporting lines are proportional to
$k^{0.4}$.

(b) Dependence of $\langle s|k,l \rangle_{0.5}$ on number of employees l, regarding k as a fixed parameter, for 1 < k < 2 (black triangles), 10 < k < 20 (red
plus signs), 100 < k < 200 (green nablas) and 1000 < k < 2000 (blue diamonds). All supporting lines are proportional to
$l^{0.9}$.

(c) Dependence of $\langle l|k,s \rangle_{0.5}$ on degree k, regarding s as a fixed parameter, for $1 \cdot 10^3 < s < 2 \cdot 10^3$ (1000 yen) (black
triangles), $1 \cdot 10^5 < s < 2 \cdot 10^5$ (1000 yen) (red plus signs), $1 \cdot 10^7 < s < 2 \cdot 10^7$ (1000 yen) (green nablas) and $1 \cdot 10^9 < s < 2 \cdot 10^9$
(1000 yen) (blue diamonds). All supporting lines are proportional to $k^{0.1}$.

(d) Dependence of $\langle l|k,s \rangle_{0.5}$ on sales s, regarding k as a fixed parameter, for 1 < k < 2 (black triangles), 10 < k < 20
(red plus signs), 100 < k < 200 (green nablas) and 1000 < k < 2000 (blue diamonds). All supporting lines are proportional
to $s^{0.7}$.



the conditional median of X for the given values of Y
and Z. The contour lines of $\langle s|k,l \rangle_{0.5}$ are characterized
by oblique lines with a slope of magnitude 0.44, as shown
in Fig. 5(a), whereas the contour lines of $\langle l|k,s \rangle_{0.5}$
are approximated by almost-horizontal lines, as shown in
Fig. 5(b). Similarly, the contour plots of $\langle k|l,s \rangle_{0.5}$ are
shown in Fig. 5(c), which is clearly different from the
former two cases.

To better understand these correlations, we investigate
the dependence of the conditional median on each
variable. Fig. 6(a) shows how $\langle s|k,l \rangle_{0.5}$ depends on
degree k when we regard l as a fixed parameter. From
this figure, we find that the value of $\langle s|k,l \rangle_{0.5}$ is characterized
by a power law in k with exponent 0.4.



From Fig. 6(b) we see how $\langle s|k,l \rangle_{0.5}$ depends on the
number of employees l when we regard k as a fixed parameter. We find
that $\langle s|k,l \rangle_{0.5}$ is proportional to $l^{0.9}$. Combining these
two results, we have the following scaling law:

$$\langle s|k,l \rangle_{0.5} \propto k^{\gamma_{s|k,l}^{(k)}} \cdot l^{\gamma_{s|k,l}^{(l)}}, \tag{12}$$

where $\gamma_{s|k,l}^{(k)} = 0.4$ and $\gamma_{s|k,l}^{(l)} = 0.9$.

Fig. 6(c) shows how $\langle l|k,s \rangle_{0.5}$ depends on the degree
k when we regard s as a fixed parameter. From this
figure, we can see that $\langle l|k,s \rangle_{0.5}$ is proportional to
$k^{0.1}$. Similarly, from Fig. 6(d), we find that $\langle l|k,s \rangle_{0.5}$
is proportional to $s^{0.7}$. Thus, we have the following scaling
law:

$$\langle l|k,s \rangle_{0.5} \propto k^{\gamma_{l|k,s}^{(k)}} \cdot s^{\gamma_{l|k,s}^{(s)}}, \tag{13}$$

where $\gamma_{l|k,s}^{(k)} = 0.1$ and $\gamma_{l|k,s}^{(s)} = 0.7$. Note that the
non-trivial scaling relations, Eqs. (12) and (13), can be
derived by carefully analyzing the conditional statistics
of three variables.



3. THE MODELS 

To clarify the mutual relations between the above-mentioned
empirical scalings, we now introduce some
simple models. First, note that the empirical relations,
Eqs. (12) and (13), seem inconsistent if we neglect fluctuations.
For example, if we assume Eq. (12), $s \propto k^{0.4} \cdot l^{0.9}$,
then we have $l \propto k^{-0.44} \cdot s^{1.1}$. However, this result disagrees
with the empirical scaling given by Eq. (13).
Therefore, to reproduce the empirical observations, we
must take into account the effects of fluctuations, which
modify the scaling relations. Here, we introduce a simple
model involving k, l, and s that assumes that these variables
are derived from three independent random variables
K, B, and A as follows:

$$k = K, \tag{14}$$

$$l = B \cdot k, \tag{15}$$

$$s = A \cdot l^{\alpha} \cdot k^{\beta}, \tag{16}$$

where Eqs. (15) and (16) refer to Eqs. (1) and (12),
respectively. Here, $\alpha = 0.9$ and $\beta = 0.4$.
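To make explicit the apparent inconsistency between Eqs. (12) and (13) noted at the start of this section, the short derivation below inverts Eq. (12) for l while ignoring fluctuations. It uses only the exponents quoted in the text, rounded in the same way.

```latex
\begin{align*}
s &\propto k^{0.4}\, l^{0.9}
  && \text{(Eq.~(12), fluctuations neglected)} \\
l^{0.9} &\propto k^{-0.4}\, s \\
l &\propto k^{-0.4/0.9}\, s^{1/0.9} \approx k^{-0.44}\, s^{1.1}
  && \text{(naive inversion)} \\
l &\propto k^{0.1}\, s^{0.7}
  && \text{(empirical, Eq.~(13))}
\end{align*}
```

The last two lines differ in both exponents, and even in the sign of the exponent of k, which is why the fluctuations of the coefficients have to enter the description.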

In this model, we determine k, I and s in the following 
order: 

1. We determine the degree k by sampling random 
variables, which we specify in the following discus- 
sion of K. 

2. We determine the number of employees l from Eq. (15) by
using the degree k determined in the previous step,
where the value of B is determined by a random
variable.

3. We determine s from Eq. (16) by using k and l
determined in the previous steps, where the value
of A is determined by a random variable.
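The three steps above can be sketched as follows. The code is a minimal illustration, not the implementation used in this study: the function names `sample_firm` and `ols_slope` are hypothetical, and the distributions passed in for K, B and A (a Pareto and two lognormals, with the parameter values quoted later for the lognormal distribution model) are assumptions made only so that the example runs.

```python
import math
import random

ALPHA, BETA = 0.9, 0.4                       # exponents of Eq. (16)

def sample_firm(rng, draw_k, draw_b, draw_a):
    """Generate one (k, l, s) triple in the order k -> l -> s."""
    k = draw_k(rng)                          # step 1: k = K
    l = draw_b(rng) * k                      # step 2: l = B * k
    s = draw_a(rng) * l ** ALPHA * k ** BETA # step 3: s = A * l^alpha * k^beta
    return k, l, s

def ols_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(2)
firms = [
    sample_firm(
        rng,
        lambda r: 3.0 * r.paretovariate(1.3),    # K: Pareto (assumed)
        lambda r: r.lognormvariate(0.72, 1.2),   # B: lognormal (assumed)
        lambda r: r.lognormvariate(9.7, 0.88),   # A: lognormal (assumed)
    )
    for _ in range(20_000)
]
log_k = [math.log(k) for k, _, _ in firms]
log_l = [math.log(l) for _, l, _ in firms]
log_s = [math.log(s) for _, _, s in firms]

slope_l_on_k = ols_slope(log_k, log_l)   # two-body exponent of l given k
slope_s_on_k = ols_slope(log_k, log_s)   # two-body exponent of s given k
```

With independent K, B and A, the fitted two-body slopes come out near 1 and near $\alpha + \beta = 1.3$, in line with the empirical exponents of Eqs. (1) and (3).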

3.1. Shuffled model 

First, let us introduce the model in which we choose
the random variables from the real values by using a bootstrapping
method. We randomly resample K from shuffled
actual degrees $k_i$ ($i = 1, 2, \cdots, N$), and B and A
are similarly chosen randomly from shuffled actual data,
TABLE I: Summary of scaling exponents.



$l_i/k_i$ ($i = 1, 2, \cdots, N$) and $s_i/(k_i^{\beta} \cdot l_i^{\alpha})$ ($i = 1, 2, \cdots, N$),
respectively. Figs. 7(a), (b), and (c) show the comparisons
between simulation results and actual results with
respect to $\langle s|k,l \rangle_{0.5}$, $\langle l|k,s \rangle_{0.5}$, and $\langle k|l,s \rangle_{0.5}$.
The results shown in these figures confirm that the model,
shown by black solid lines, almost reproduces the contours of
the actual data, which are shown by red dashed lines. In
addition, we also confirm that the model reproduces the
conditional probabilities between two variables, P(l|k),
P(s|k), and P(s|l), which are shown by the red dashed
lines in Fig. 8, and the marginal distributions P(k), P(l),
and P(s), shown by the red dashed lines in Fig. 4. In
all cases the distributions are nicely reproduced by this
shuffled model.
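A minimal sketch of the shuffled model is given below. It assumes that "shuffling" means an independent random permutation of each of the three empirical columns (K, B and A); the text does not specify whether sampling is with or without replacement, so this is one possible reading. The function name `shuffled_model` and the four toy firms are hypothetical.

```python
import random

ALPHA, BETA = 0.9, 0.4

def shuffled_model(k, l, s, rng):
    """Rebuild (k, l, s) from independently permuted K, B and A samples."""
    big_k = list(k)
    big_b = [li / ki for ki, li in zip(k, l)]                    # B_i = l_i / k_i
    big_a = [si / (li ** ALPHA * ki ** BETA)                     # A_i from Eq. (16)
             for ki, li, si in zip(k, l, s)]
    for column in (big_k, big_b, big_a):
        rng.shuffle(column)          # removes correlations among K, B and A
    k_new = big_k
    l_new = [b * kk for b, kk in zip(big_b, k_new)]
    s_new = [a * ll ** ALPHA * kk ** BETA
             for a, ll, kk in zip(big_a, l_new, k_new)]
    return k_new, l_new, s_new

# Tiny hypothetical input, only to show the call.
k = [3.0, 5.0, 12.0, 40.0]
l = [4.0, 10.0, 15.0, 90.0]
s = [2.0e4, 9.0e4, 1.5e5, 2.0e6]
k_sim, l_sim, s_sim = shuffled_model(k, l, s, random.Random(3))
```

By construction, the simulated data keep the empirical one-body distributions of K, B and A exactly, while any dependence between them is destroyed.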

The differences between this model and the actual phenomena
are as follows: (i) The correlation between B and k
is removed. (ii) The correlation between A and $l^{\alpha} k^{\beta}$ is
removed. (iii)With respect to Eqs. (Til)]) and ([TS]). there 
are non-power-law regions for small values in the
actual observations. However, for simplicity, we approximate
these relations by single power laws over all regions in the model.
(iv) Discrete quantities in the actual data are approximated
by continuous quantities.

In addition, this model is not the only one
that can reproduce the empirical scaling relations; for example,
we can change the order of the variables. By
checking all combinations, we find that this model with
the given order of construction produces the most accurate
results in comparison with the real data within our framework.
This simple reconstruction model is based on
the definition of conditional probability for three-body
stochastic variables and uses the empirically derived scaling
relations for the conditional probability densities.



3.2. Lognormal distribution model 

FIG. 7: (a) Conditional median of sales s for given degree k and number of employees l, $\langle s|k,l \rangle_{0.5}$, for the shuffled model. The
black lines are the contour lines for a given s (the numbers on the lines denote the digits of annual sales in multiples of 1000
yen; for example, "5" in the figure means $10^8$ yen). The red dashed lines provide the corresponding contours for actual data
(we plot the other cases in a similar manner). (b) Conditional median of the number of employees l for given degree k and
sales s, $\langle l|k,s \rangle_{0.5}$, for the shuffled model. The contour lines of l are plotted in a similar manner as for panel (a) (the numbers
on the lines show the digits; for example, "3" in the figure means $l = 10^3$). (c) Conditional median of degree k given number
of employees l and sales s, $\langle k|l,s \rangle_{0.5}$, for the shuffled model. (d)-(f) Panels corresponding to panels (a)-(c), respectively, for
the lognormal distribution model.

FIG. 8: Comparison of conditional PDFs, P(l|k), P(s|k), and
P(s|l) for actual data (black solid lines), shuffled model (red
dashed lines) and lognormal distribution model (green dash-dotted
lines). (a) Conditional PDF of number of employees l for given
degree k, P(l|k), for k = 3 (thin lines), $k = 14 \cdot 10$ (medium
lines) and $k = 14 \cdot 10^2$ (thick lines). For actual data and the
shuffled model, we apply the following conditions to include a
sufficient number of samples: $k = 14 \cdot 10 \approx \sqrt{100 \cdot 200}$ for the
interval 100 < k < 200 and $k = 14 \cdot 10^2 \approx \sqrt{1000 \cdot 2000}$ for
the interval 1000 < k < 2000. For the lognormal distribution
model, the PDF is given by Eq. (B2). (b) Corresponding plots
for conditional PDFs of sales s for given k, P(s|k). For the
lognormal distribution model, the PDF is given by Eq. (B5).
(c) Conditional PDFs of sales s for given l, P(s|l), for l = 3
(thin lines), $l = 14 \cdot 10$ (medium lines) and $l = 14 \cdot 10^2$ (thick
lines). For actual data and the shuffled model, we apply the
following conditions: $l = 14 \cdot 10 \approx \sqrt{100 \cdot 200}$ for the interval
100 < l < 200 and $l = 14 \cdot 10^2 \approx \sqrt{1000 \cdot 2000}$ for the interval
1000 < l < 2000. For the lognormal distribution model, the
PDF is given by Eq. (B10).

FIG. 9: (a) PDF of coefficient A given by Eq. (16) for actual
data (black solid line; by definition the shuffled model gives the
same PDF), and for the lognormal distribution model (red
dashed line). (b) PDF of B given by Eq. (15) for actual data
(black solid line; by definition the shuffled model gives the same
PDF) and for the lognormal distribution model (red dashed
line). For both cases, the central parts of the real distributions
are approximated by the lognormal distribution model.

Next, we investigate how the scaling exponents depend
on the magnitude of the fluctuations. Here, the analytical
calculation is done by approximating the distributions
of A and B by log-normal distributions
$\phi(A; \mu_A, \sigma_A)$ and $\phi(B; \mu_B, \sigma_B)$, and that of K by the Pareto
distribution $q(K; \lambda, k_m)$, where



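$$\phi(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma x} \exp\!\left(-\frac{(\ln(x) - \mu)^2}{2\sigma^2}\right) \quad (0 < x < \infty), \tag{17}$$

$$q(x; \lambda, x_m) = \frac{\lambda x_m^{\lambda}}{x^{\lambda + 1}} \quad (x_m \le x < \infty). \tag{18}$$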



The real data are best approximated with the set of
parameters $\mu_A = 9.7$, $\sigma_A = 0.88$, $\mu_B = 0.72$, $\sigma_B = 1.2$,
$\lambda = 1.3$ and $k_m = 3$. In this study, we refer to this set
of values as the best parameter set. Here, $\mu_A$ and
$\mu_B$ are estimated by the mean of the actual values of A and
B, and $\sigma_A$ and $\sigma_B$ are estimated by the standard deviation of
the data. The quantities $\lambda$ and $k_m$ are determined by
fitting Eq. (18) to the real data, as shown in Fig. 4(a)
by the green dash-dotted lines. From Fig. 9 we see that
the central part of the actual distributions is reasonably
approximated by these lognormal distributions for A and
B. However, significant disagreement occurs for the tail
parts. In addition, for K, the tail part of the empirical
distribution (i.e., above $k_m$) is well approximated by the
above-mentioned Pareto distribution, which is shown by
the green dash-dotted lines in Fig. 4(a).
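A parameter estimation of this kind can be sketched as follows. The sketch rests on two assumptions that go beyond the text: the lognormal parameters are taken as the mean and standard deviation of the logarithms of the coefficients, and the Pareto exponent is obtained by maximum likelihood for a fixed lower cutoff rather than by the graphical fit described above. The function names `fit_lognormal` and `fit_pareto_exponent` are hypothetical, and the inputs are synthetic samples rather than the actual coefficients.

```python
import math
import random
import statistics

def fit_lognormal(values):
    """Return (mu, sigma): mean and standard deviation of ln(values)."""
    logs = [math.log(v) for v in values]
    return statistics.fmean(logs), statistics.stdev(logs)

def fit_pareto_exponent(values, x_min):
    """Maximum-likelihood lambda for q(x) ~ x^(-lambda-1) on x >= x_min."""
    tail = [v for v in values if v >= x_min]
    return len(tail) / sum(math.log(v / x_min) for v in tail)

# Synthetic samples generated with the best parameter set (assumption:
# these stand in for the empirical coefficients A and degrees k).
rng = random.Random(4)
a_samples = [rng.lognormvariate(9.7, 0.88) for _ in range(20_000)]
k_samples = [3.0 * rng.paretovariate(1.3) for _ in range(20_000)]

mu_a, sigma_a = fit_lognormal(a_samples)
lam = fit_pareto_exponent(k_samples, x_min=3.0)
```

On clean synthetic samples both estimators recover the input parameters closely; on the real data the heavy tails of A and B noted above would make the lognormal fit representative of the central part only.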



3.2.1. Correlations among three variables 

Here, we discuss the relation between $\langle s|k,l \rangle_{0.5}$,
$\langle l|k,s \rangle_{0.5}$ and $\langle k|l,s \rangle_{0.5}$. If A and B follow
log-normal distributions and K follows the Pareto distribution,
we can calculate these values rigorously. The
details of the derivation are given in Appendix A. In this
section, we give only the results.

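$\langle s|k,l \rangle_{0.5}$ and $\langle l|k,s \rangle_{0.5}$ can be written as

$$\langle s|k,l \rangle_{0.5} \propto l^{\alpha} \cdot k^{\beta}, \tag{19}$$

$$\langle l|k,s \rangle_{0.5} \propto k^{-\nu\kappa_l + 1} \cdot s^{\kappa_l}, \tag{20}$$

where $\nu = \alpha + \beta$ and $\kappa_l = \alpha/(\alpha^2 + \sigma_A^2/\sigma_B^2)$. These equations
agree well with the real data [see Figs. 7(d) and (e)].
Here, the scaling indices for the conditional scaling relations
are given by the model's parameters as $\gamma_{s|k,l}^{(k)} = \beta$,
$\gamma_{s|k,l}^{(l)} = \alpha$, $\gamma_{l|k,s}^{(k)} = -\nu\kappa_l + 1$ and $\gamma_{l|k,s}^{(s)} = \kappa_l$.
For the best parameter set, $\gamma_{s|k,l}^{(k)} = 0.4$, $\gamma_{s|k,l}^{(l)} = 0.9$,
$\gamma_{l|k,s}^{(k)} = 0.1$ and $\gamma_{l|k,s}^{(s)} = 0.7$, which agree with the empirical
scaling indices (see Table I). Note that Eq. (20) implies
that the scaling exponent of $\langle l|k,s \rangle_{0.5}$ depends
on the magnitude of the fluctuations of A and B. For
example, in the limit $\sigma_A^2 \to 0$, we have $\langle l|k,s \rangle_{0.5} \propto
k^{-\beta/\alpha} \cdot s^{1/\alpha}$, which corresponds to the analytical solution
of $s = A \cdot l^{\alpha} \cdot k^{\beta}$ given by Eq. (16) with respect to
l neglecting the fluctuation. Conversely, for $\sigma_B^2 \to 0$,
$\langle l|k,s \rangle_{0.5} \propto k$, which corresponds to the solution of
$l = B \cdot k$ given by Eq. (15).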
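The dependence of the three-body exponents on the fluctuation magnitudes can be checked numerically. The sketch below assumes the closed form $\kappa_l = \alpha/(\alpha^2 + \sigma_A^2/\sigma_B^2)$, which follows from Gaussian conditioning of $\ln l$ on $\ln k$ and $\ln s$ in the lognormal model, and compares it with a two-regressor least-squares fit on simulated data. The function names are hypothetical, and the regression of logarithms is used here as a convenient stand-in for the conditional median analysis.

```python
import math
import random

def kappa_l(alpha, sigma_a, sigma_b):
    """kappa_l = alpha / (alpha^2 + sigma_A^2 / sigma_B^2) (assumed form)."""
    return alpha / (alpha ** 2 + sigma_a ** 2 / sigma_b ** 2)

def three_body_exponents(alpha, beta, sigma_a, sigma_b):
    """Exponents (on k, on s) of the conditional median of l given k and s."""
    kl = kappa_l(alpha, sigma_a, sigma_b)
    return 1.0 - (alpha + beta) * kl, kl

def fit_two_regressors(x1, x2, y):
    """OLS coefficients of y on x1 and x2 (intercept removed by centering)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

ALPHA, BETA, SIGMA_A, SIGMA_B = 0.9, 0.4, 0.88, 1.2   # best parameter set
exp_k, exp_s = three_body_exponents(ALPHA, BETA, SIGMA_A, SIGMA_B)

# Simulate ln k, ln l, ln s from the lognormal distribution model.
rng = random.Random(5)
log_k, log_l, log_s = [], [], []
for _ in range(50_000):
    lk = math.log(3.0 * rng.paretovariate(1.3))
    ll = lk + rng.gauss(0.72, SIGMA_B)                       # ln l = ln B + ln k
    ls = rng.gauss(9.7, SIGMA_A) + ALPHA * ll + BETA * lk    # ln s, Eq. (16)
    log_k.append(lk)
    log_l.append(ll)
    log_s.append(ls)

fit_k, fit_s = fit_two_regressors(log_k, log_s, log_l)

# Limit of vanishing fluctuation of A: the naive inversion is recovered.
naive_k, naive_s = three_body_exponents(ALPHA, BETA, 1e-9, SIGMA_B)
```

For the best parameter set the closed form gives exponents of about 0.13 and 0.67, which round to the empirical values 0.1 and 0.7, while the limit of vanishing $\sigma_A$ returns the naive values $-\beta/\alpha$ and $1/\alpha$.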

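From the rigorous formula for $\langle k|l,s \rangle_{0.5}$ given by Eq.
(A13), and in the case of the best parameter set, we obtain
the following scaling law for $z \to -\infty$:

$$\langle k|l,s \rangle_{0.5} \propto l^{-\nu\kappa_k + 1} \cdot s^{\kappa_k}, \tag{21}$$

where $\kappa_k = \beta/(\beta^2 + \sigma_A^2/\sigma_B^2)$.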

In the central region, the theoretical curves roughly 
agree with actual curves; however, they disagree near the 
extremities. Comparing Fig. [3c) with Fig. 07), we see 
that the cause for disagreements around the edges of the 
contour lines comes from the deviation in the tail portion 
of the distributions of A and B (shown in Fig. [9]) . 



3.2.2. Correlations between two variables 

Here, we calculate the conditional distributions for two
variables theoretically based on the log-normal model.
The details of the derivation are given in Appendix B.

From Eqs. (B3) and (B6), the conditional percentiles
of l given k and the conditional percentiles of s given
k can be written as:



$$\langle l|k \rangle_q \propto k \quad (0 < q < 1), \tag{22}$$

$$\langle s|k \rangle_q \propto k^{\nu} \quad (0 < q < 1), \tag{23}$$

which correspond to the empirical equations Eq. (1) and
Eq. (3), respectively. Thus, $\gamma_{l|k} = 1$ and $\gamma_{s|k} =
\nu = \alpha + \beta = 1.3$ for the best parameter set. Similarly,
from Eq. ma, we have the conditional percentiles of s 
given I, 



$$\langle s|l \rangle_q \propto l^{\nu} \quad (l \to \infty;\ \sigma_A^2/\sigma_B^2 > \alpha\beta), \tag{24}$$

where $0 < q < 1$. This equation corresponds to the empirical
equation Eq. (5); namely, $\gamma_{s|l} = \nu = \alpha + \beta = 1.3$.

We can also analytically calculate the conditional distributions
P(l|k), P(s|k) and P(s|l). From Fig. 8 we
see that, except for the tail portions, the empirical curves
plotted as black lines agree well with the green dash-dotted
lines, which supports the validity of Eqs. (B2), (B5)
and (B10). Note that the discrepancies are again due to
the deviations in the tail portions of the distributions
of the coefficients A and B.



3.2.3. Marginal distributions 

Finally, we calculate the marginal distributions.
The details of the derivation
are given in Appendix C.

The marginal distribution of k is given by the distribution
of K, $q(K; \lambda, k_m)$. Because k is given by Eq. (14),
we have the following power law distribution:

$$P(k) \propto k^{-\lambda - 1} \quad (k_m \le k < \infty). \tag{25}$$

Thus, $\zeta_k = \lambda$, which takes 1.3 for the best parameter set.

The asymptotic behavior of the marginal distributions
of l and s is derived as:

$$P(l) \propto l^{-\lambda - 1} \quad (l \to \infty), \tag{26}$$

$$P(s) \propto s^{-\frac{\lambda}{\alpha + \beta} - 1} \quad (s \to \infty). \tag{27}$$

Thus, for the best parameter set, $\zeta_l = \lambda$, which takes 1.3,
and $\zeta_s = \lambda/(\alpha + \beta)$, which takes 1.0. These equations
correspond to the empirical PDF given by Eq. (7).

The green dash-dotted lines in Figs. 4(a)-(c) are the
theoretical curves given by Eqs. (C1), (C3) and (C7).
These figures show that the empirical distributions are
closely fit by the theoretical curves.

Table I summarizes the scaling exponents mentioned
in Sec. [5] derived from empirical observations and the 
corresponding theoretical exponents discussed in Sec 
4. DISCUSSION AND CONCLUSION

In this study, we analyzed the scaling behavior of scale
indicators for Japanese firms. In particular, we focused on
three basic scale indicators: sales (flow value), number
of employees (stock value), and number of business partners
(business relation). First, by analyzing the financial
data of about 500,000 Japanese firms, we established the
following relations:

(i) The conditional percentiles scale with an exponent of about 1.0 for the number of employees based on degrees, about 1.3 for sales based on degrees, and about 1.3 for sales based on the number of employees;

(ii) The corresponding conditional distribution functions each collapse onto a single scaling function when rescaled by the respective conditional medians;

(iii) New scaling relations appear among the three variables, such as the scaling of the conditional median of sales based on the numbers of business partners and employees.

Second, we introduced simple stochastic models that reproduce all empirical scaling relations consistently, and we derived the nontrivial relation between scaling indices and fluctuations. To provide a consistent explanation of these three-body scaling relations, we showed that it is necessary to consider the effects of fluctuations in the coefficients. In other words, the scaling indices depend on the magnitude of the fluctuations. It is interesting that for two-body relations, which have been well studied, the fluctuations do not modulate the exponents. To clarify such an effect on the allometric scaling relations, a more in-depth study is required into situations involving more than three variables.

Regarding the scaling of the metabolic rate of mammals, the geometric structure of a vessel network has been shown to explain the allometric properties. Similarly, we can pose a basic question: can we explain our empirical scaling relations from the network structure of the interfirm trading relations? In our recent study, we showed that the scaling of sales based on degree with exponent 1.3 and the power-law distribution of sales with exponent 1 are explained by the transport of money through the interfirm trading network [25]. Moreover, this transport model explains the scaling of sales based on employees with exponent 1.3. However, in its present form, this transport model cannot reproduce all the scaling relations for the three variables. Our task in the near future is to extend the network model so that the key coefficients A and B in Eqs. (14)-(16) can be estimated from information on the network structure. We can also associate these properties with the interfirm trading network. A detailed survey along these lines will be reported in a future presentation.

Appendix A: Conditional medians $\langle s|k,l\rangle_{0.5}$, $\langle l|k,s\rangle_{0.5}$, and $\langle k|l,s\rangle_{0.5}$

Here, we calculate the median of $s$ for given $l$ and $k$, $\langle s|k,l\rangle_{0.5}$. Taking the logarithm, Eqs. (14)-(16) can be transformed into

$$k' = K', \tag{A1}$$

$$l' = B' + k', \tag{A2}$$

$$s' = A' + \alpha l' + \beta k', \tag{A3}$$

where $s' = \log(s)$, $k' = \log(k)$, $l' = \log(l)$, $A' = \log(A)$, $B' = \log(B)$, and $K' = \log(K)$. Thus, the PDF of $A'$ is $\phi(A'; \mu_A, \sigma_A)$, the PDF of $B'$ is $\phi(B'; \mu_B, \sigma_B)$, and the PDF of $K'$ is $\lambda\exp(-\lambda(K' - \log(k_m)))$. Here, $\phi(x; \mu, \sigma)$ is the PDF of the normal distribution whose mean is $\mu$ and standard deviation is $\sigma$. Because $A'$ obeys the normal distribution, the conditional mean of logarithmic sales $\langle s'|k',l'\rangle$ is

$$\langle s'|k',l'\rangle = \langle A'\rangle + \alpha l' + \beta k' = \mu_A + \alpha l' + \beta k'. \tag{A4}$$

Thus, we get

$$\langle s|k,l\rangle_{0.5} = \exp(\langle s'|k',l'\rangle) = \exp(\mu_A)\, l^{\alpha} k^{\beta}, \tag{A5}$$

where we use a property of the lognormal distribution, namely, if $x$ has the lognormal PDF $\phi'(x; \mu, \sigma)$, then the median of $x$ is $\exp(\mu)$. This equation is Eq. (19) of the main text.
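The median formula of Eq. (A5) can be compared against a direct simulation of the lognormal coefficient $A$. The following is a minimal sketch; the parameter values and the function names `median_sales` and `simulated_median_sales` are hypothetical and are not taken from the fitted model.

```python
import math
import random
import statistics

# Hypothetical values for illustration only; none are taken from the paper's fit.
ALPHA, BETA = 0.5, 0.8
MU_A, SIGMA_A = 0.2, 0.5


def median_sales(k, l):
    """Conditional median of s given k and l, following Eq. (A5)."""
    return math.exp(MU_A) * l ** ALPHA * k ** BETA


def simulated_median_sales(k, l, n=20001, seed=1):
    """Sample s = A * l**ALPHA * k**BETA with lognormal A and take the sample median."""
    rng = random.Random(seed)
    draws = [math.exp(rng.gauss(MU_A, SIGMA_A)) * l ** ALPHA * k ** BETA for _ in range(n)]
    return statistics.median(draws)


k, l = 10.0, 40.0
print(median_sales(k, l), simulated_median_sales(k, l))
```

The two printed values should agree up to sampling error, because the median of a lognormal variable depends only on the mean of its logarithm and not on $\sigma_A$.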

Similarly, we calculate the value of $\langle l|k,s\rangle_{0.5}$. From Eqs. (A2) and (A3), we have

$$s' = A' + \beta k' + \alpha(B' + k') = \nu k' + c, \tag{A6}$$

where $c = A' + \alpha B'$ and $\nu = \alpha + \beta$. Here, $c$ is fixed once $k$ and $s$ are given. Therefore, conditioning on $k$ and $s$ is equivalent to conditioning on $c = s' - \nu k'$:

$$\langle B'|k',s'\rangle = \langle B'|c\rangle. \tag{A7}$$

From Bayes' theorem, the conditional probability of $B'$ is estimated using

$$P(B'|c) \propto P(c|B')P(B') \propto \phi(c; \mu_A + \alpha B', \sigma_A) \cdot \phi(B'; \mu_B, \sigma_B) \propto \phi(B'; \mu_{B'|c}, \sigma_{B'|c}), \tag{A8}$$



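where $\mu_{B'|c} = \kappa_l c + \tau_l$, $\sigma_{B'|c} = \sigma_A\sqrt{\kappa_l/\alpha}$, $\kappa_l = \alpha/(\alpha^2 + \sigma_A^2/\sigma_B^2)$, and $\tau_l = \kappa_l/\alpha \cdot (-\alpha\mu_A + \sigma_A^2/\sigma_B^2 \cdot \mu_B)$. Thus, we have $\langle B'|c\rangle = \kappa_l c + \tau_l$. Taking the conditional mean of Eq. (A2) and substituting Eq. (A7) gives

$$\langle l'|k',s'\rangle = k' + \langle B'|k',s'\rangle = (-\nu\kappa_l + 1)k' + \kappa_l s' + \tau_l. \tag{A9}$$

Therefore,

$$\langle l|k,s\rangle_{0.5} = \exp(\langle l'|k',s'\rangle_{0.5}) \propto k^{-\nu\kappa_l+1} s^{\kappa_l}. \tag{A10}$$

This equation gives Eq. (20) in Sec. 3.2.1. Here, we use a property of the lognormal distribution; that is, if $x$ has the lognormal PDF $\phi'(x; \mu, \sigma)$, then the median of $x$ is given by $\exp(\mu)$.

We also calculate $\langle k|l,s\rangle_{0.5}$. Because $A'$ and $B'$ obey normal distributions, Eqs. (A2) and (A3) imply that the conditional probability of $l'$ and $s'$ for given $k'$ is written as

$$P(l',s'|k') = f(l', s'; \mu_{l'|k'}, \mu_{s'|k'}, \sigma_{l'|k'}, \sigma_{s'|k'}, \rho), \tag{A11}$$

where

$$\mu_{s'|k'} = \nu k' + \mu_A + \alpha\mu_B, \quad \mu_{l'|k'} = k' + \mu_B, \quad \sigma_{s'|k'}^2 = \sigma_A^2 + \alpha^2\sigma_B^2, \quad \sigma_{l'|k'}^2 = \sigma_B^2, \quad \rho = \alpha\,\frac{\sigma_{l'|k'}}{\sigma_{s'|k'}},$$

and $f$ is the PDF of the bivariate normal distribution:

$$f(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y, \rho) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\!\left(-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho\,(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}\right]\right).$$

Applying Bayes' theorem, we have the following relation:

$$P(k'|l',s') \propto P(l',s'|k')P(k') \propto \phi(k'; \mu_{k'|l',s'}, \sigma_{k'|l',s'}), \tag{A12}$$

where

$$\mu_{k'|l',s'} = (-\nu\kappa_k + 1)l' + \kappa_k s' + \tau_k,$$

$$\sigma_{k'|l',s'} = \sqrt{(1-\rho^2) \cdot \frac{\kappa_k}{\kappa_l} \cdot \frac{\alpha}{\beta}} \cdot \sigma_{l'|k'},$$

$$\kappa_k = \frac{\beta}{\beta^2 + \sigma_A^2/\sigma_B^2},$$

$$\tau_k = -\left(\beta\mu_A + \mu_B\,\sigma_A^2/\sigma_B^2\right) \cdot \frac{\kappa_k}{\beta} - \lambda\sigma_{k'|l',s'}^2,$$

and the support of $P(k'|l',s')$ is $\log(k_m) < k' < \infty$. In other words, $k'|l',s'$ obeys the truncated normal distribution with the following parameters: the mean is $\mu_{k'|l',s'}$, the standard deviation is $\sigma_{k'|l',s'}$, the minimum value is $\log(k_m)$, and the maximum value is $\infty$. Applying a general formula for the median of a truncated normal distribution, we get

$$\langle k'|l',s'\rangle_{0.5} = \sigma_{k'|l',s'} \cdot \Phi_0^{-1}\!\left(\Phi_0(z) + \tfrac{1}{2}\,(1 - \Phi_0(z))\right) + \mu_{k'|l',s'}, \tag{A13}$$

where

$$z = \frac{\log(k_m) - \mu_{k'|l',s'}}{\sigma_{k'|l',s'}}$$

and

$$\Phi_0(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} \exp\!\left(-\frac{t^2}{2}\right) dt.$$

Therefore, we have the following scaling relation:

$$\langle k|l,s\rangle_{0.5} \propto l^{-\nu\kappa_k+1} \cdot s^{\kappa_k} \quad (z \to -\infty). \tag{A14}$$

This equation is Eq. (21) in Sec. 3.2.1.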
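The truncated-normal median behind Eqs. (A13) and (A14) can be evaluated numerically. The sketch below uses Python's `statistics.NormalDist` for $\Phi_0$ and its inverse; the parameter values and the helper names `k_given_ls_params` and `median_log_k` are hypothetical, and the conditional mean and width are computed as the Gaussian posterior of $k'$ given $l'$ and $s'$, shifted by the exponential prior on $k'$.

```python
import math
from statistics import NormalDist

PHI0 = NormalDist()  # standard normal: PHI0.cdf is Phi_0, PHI0.inv_cdf is its inverse


def k_given_ls_params(alpha, beta, mu_a, sigma_a, mu_b, sigma_b, lam):
    """Return (kappa_k, tau_k, sigma_k) for the normal part of P(k' | l', s')."""
    r = sigma_a ** 2 / sigma_b ** 2
    kappa_l = alpha / (alpha ** 2 + r)
    kappa_k = beta / (beta ** 2 + r)
    rho_sq = alpha ** 2 * sigma_b ** 2 / (sigma_a ** 2 + alpha ** 2 * sigma_b ** 2)
    sigma_k = math.sqrt((1.0 - rho_sq) * kappa_k / kappa_l * alpha / beta) * sigma_b
    tau_k = -(beta * mu_a + mu_b * r) * kappa_k / beta - lam * sigma_k ** 2
    return kappa_k, tau_k, sigma_k


def median_log_k(log_l, log_s, log_k_min, alpha, beta, mu_a, sigma_a, mu_b, sigma_b, lam):
    """Median of k' = log(k) given l' and s' for a normal truncated below at log(k_m)."""
    kappa_k, tau_k, sigma_k = k_given_ls_params(alpha, beta, mu_a, sigma_a, mu_b, sigma_b, lam)
    mu_k = (1.0 - (alpha + beta) * kappa_k) * log_l + kappa_k * log_s + tau_k
    z = (log_k_min - mu_k) / sigma_k
    p = PHI0.cdf(z) + 0.5 * (1.0 - PHI0.cdf(z))  # midpoint of the surviving probability mass
    return mu_k + sigma_k * PHI0.inv_cdf(p), mu_k


# Hypothetical parameters; only alpha + beta = 1.3 and lam = 1.3 echo the text.
params = dict(alpha=0.5, beta=0.8, mu_a=0.0, sigma_a=1.0, mu_b=0.0, sigma_b=0.5, lam=1.3)
near_cutoff, mu_near = median_log_k(0.0, 0.0, 0.0, **params)
far_from_cutoff, mu_far = median_log_k(10.0, 13.0, 0.0, **params)
print(near_cutoff, mu_near, far_from_cutoff, mu_far)
```

Close to the cutoff the truncation pushes the median above the untruncated mean, whereas far from the cutoff ($z \to -\infty$) the two coincide, which is the regime in which the pure power law of Eq. (A14) applies.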



Appendix B: Conditional distributions of $l|k$, $s|k$, and $s|l$

We now calculate the conditional distributions for two variables. Let us consider the conditional random variable $l$ for given $k$, $l|k$. From Eq. (A2), we get the conditional distribution of $l'$ given $k'$ as

$$P(l'|k') = \phi(l'; \mu_B + k', \sigma_B). \tag{B1}$$

Thus, the distribution of $l|k$ is given by

$$P(l|k) = \phi'(l; \mu_B + \log(k), \sigma_B). \tag{B2}$$

Therefore, the conditional percentile $100q$ of $l|k$ is written as

$$\langle l|k\rangle_q = \exp(\langle l'|k'\rangle_q) \propto \exp(\log(k)) \propto k. \tag{B3}$$

Eq. (B3) corresponds to Eq. (22) in Sec. 3.2.2.

Next, we discuss $s|k$. From Eqs. (A2) and (A3), we obtain

$$P(s'|k') = \phi(s'; \mu_A + \alpha\mu_B + (\alpha+\beta)k', \sigma_{s'|k'}), \tag{B4}$$

where $\sigma_{s'|k'} = \sqrt{\sigma_A^2 + \alpha^2\sigma_B^2}$. Accordingly, the distribution of $s|k$ is

$$P(s|k) = \phi'(s; \mu_A + \alpha\mu_B + \nu\log(k), \sigma_{s'|k'}). \tag{B5}$$

Thus, the conditional percentile $100q$ of $s|k$ is written as:

$$\langle s|k\rangle_q = \exp(\langle s'|k'\rangle_q) \propto \exp(\nu\log(k)) \propto k^{\nu}. \tag{B6}$$

This equation is Eq. (23) in Sec. 3.2.2.
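Because the width of the lognormal distribution of $s|k$ does not depend on $k$, every percentile scales with the same exponent $\nu$. The short sketch below illustrates this with hypothetical parameter values; `sales_percentile` is an illustrative helper, not a function from the original analysis.

```python
import math
from statistics import NormalDist


def sales_percentile(k, q, alpha, beta, mu_a, sigma_a, mu_b, sigma_b):
    """100q-th percentile of s given k when log(s) | k is normal, as in Eq. (B4)."""
    mean_log = mu_a + alpha * mu_b + (alpha + beta) * math.log(k)
    sd_log = math.sqrt(sigma_a ** 2 + alpha ** 2 * sigma_b ** 2)
    return math.exp(NormalDist(mean_log, sd_log).inv_cdf(q))


# Hypothetical parameters with alpha + beta = 1.3.
params = dict(alpha=0.5, beta=0.8, mu_a=0.0, sigma_a=1.0, mu_b=0.0, sigma_b=0.5)
for q in (0.25, 0.5, 0.75):
    ratio = sales_percentile(100.0, q, **params) / sales_percentile(10.0, q, **params)
    print(q, ratio, 10.0 ** (params["alpha"] + params["beta"]))
```

For each $q$, increasing $k$ by a factor of 10 multiplies the percentile by $10^{\nu}$, so the percentile curves are parallel lines on a log-log plot.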

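Similarly, we consider $s|l$. From Eq. (B1) and Bayes' theorem,

$$P(k'|l') \propto P(l'|k')P(k') \propto \phi(k'; l' - \mu_B - \lambda\sigma_B^2, \sigma_B), \tag{B7}$$

where the support of this distribution is $\log(k_m) < k' < \infty$. Here, from Eq. (A3), we have the following relation:

$$s'|l' = A' + \alpha l' + \beta k'|l'. \tag{B8}$$

Note that $A'$ and $k'|l'$ are independent of each other, so by taking the convolution of the PDFs of $A' + \alpha l'$ and $\beta k'|l'$, we arrive at the following relation:

$$P(s'|l') \propto \int_{\beta\log(k_m)}^{\infty} \phi(s' - x; \mu_A + \alpha l', \sigma_A) \cdot \phi(x; \beta(l' - \mu_B - \lambda\sigma_B^2), \beta\sigma_B)\, dx \propto \phi(s'; M_1'(l'), S_1) \cdot \left\{1 - \Phi(\beta\log(k_m); M_2'(s', l'), S_2)\right\}, \tag{B9}$$

where, completing the square in $x$,

$$M_1'(l') = \nu l' + \mu_A + \beta(-\mu_B - \lambda\sigma_B^2), \quad S_1 = \sqrt{\sigma_A^2 + \beta^2\sigma_B^2},$$

$$M_2'(s', l') = S_2^2\left[\frac{s' - \mu_A - \alpha l'}{\sigma_A^2} + \frac{l' - \mu_B - \lambda\sigma_B^2}{\beta\sigma_B^2}\right], \quad S_2 = \left(\frac{1}{\sigma_A^2} + \frac{1}{\beta^2\sigma_B^2}\right)^{-1/2},$$

and

$$\Phi(x; \mu, \sigma) = \int_{-\infty}^{x} \phi(t; \mu, \sigma)\, dt.$$

Consequently, the distribution of $s|l$ is estimated by using

$$P(s|l) \propto \frac{1}{s}\,\phi(\log(s); M_1(l), S_1) \cdot \left\{1 - \Phi(\beta\log(k_m); M_2(s, l), S_2)\right\}, \tag{B10}$$

where $M_1(l) = M_1'(\log(l))$ and $M_2(s, l) = M_2'(\log(s), \log(l))$.

If $\sigma_A^2/\sigma_B^2 > \alpha\beta$, the coefficient of $l'$ in $M_2'$ is positive for fixed $s'$, so the truncation factor tends to 1 and we get the following asymptotic behavior for $l \to \infty$:

$$P(s|l) \propto \frac{1}{s}\,\phi(\log(s); M_1(l), S_1) = \phi'(s; M_1(l), S_1). \tag{B11}$$

Therefore, we have the following scaling relation:

$$\langle s|l\rangle_q \propto \exp(\nu\log(l)) \propto l^{\nu} \quad (l \to \infty). \tag{B12}$$

This equation is Eq. (24) in Sec. 3.2.2.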
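The approach to the pure lognormal form for large $l$ can be illustrated by evaluating the truncation factor $1 - \Phi(\beta\log(k_m); M_2, S_2)$ numerically. In the sketch below, the mean and width of the truncation term are obtained by completing the square for the product of the two normal densities in the convolution; this is an assumption of the sketch rather than a quotation of the original formulas, and all parameter values and function names are hypothetical.

```python
import math
from statistics import NormalDist


def typical_log_s(log_l, alpha, beta, mu_a, mu_b, sigma_b, lam):
    """Centre M_1'(l') of the lognormal part of P(s' | l')."""
    return (alpha + beta) * log_l + mu_a + beta * (-mu_b - lam * sigma_b ** 2)


def truncation_factor(log_s, log_l, log_k_min, alpha, beta, mu_a, sigma_a, mu_b, sigma_b, lam):
    """Return 1 - Phi(beta * log(k_m); M2, S2) for the product of the two normals."""
    # First factor, read as a density in x: mean log_s - mu_a - alpha * log_l, variance sigma_a**2.
    m1, v1 = log_s - mu_a - alpha * log_l, sigma_a ** 2
    # Second factor: mean beta * (log_l - mu_b - lam * sigma_b**2), variance (beta * sigma_b)**2.
    m2, v2 = beta * (log_l - mu_b - lam * sigma_b ** 2), (beta * sigma_b) ** 2
    s2_sq = 1.0 / (1.0 / v1 + 1.0 / v2)       # precision-weighted variance
    big_m2 = s2_sq * (m1 / v1 + m2 / v2)      # precision-weighted mean
    return 1.0 - NormalDist(big_m2, math.sqrt(s2_sq)).cdf(beta * log_k_min)


# Hypothetical parameters satisfying sigma_a**2 / sigma_b**2 > alpha * beta.
p = dict(alpha=0.5, beta=0.8, mu_a=0.0, sigma_a=1.0, mu_b=0.0, sigma_b=0.5, lam=1.3)
for log_l in (0.0, 1.0, 2.0, 5.0):
    log_s = typical_log_s(log_l, p["alpha"], p["beta"], p["mu_a"], p["mu_b"], p["sigma_b"], p["lam"])
    print(log_l, truncation_factor(log_s, log_l, 0.0, **p))
```

With these values the factor is well below 1 near the cutoff and approaches 1 as $\log(l)$ grows, which is the regime in which the percentiles of $s|l$ follow $l^{\nu}$.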



Appendix C: Marginal distributions of $k$, $l$, and $s$

Finally, we calculate the marginal distributions of $k$, $l$, and $s$. Because of the definition of $k$, the marginal distribution of $k$ is the same as the distribution of $K$. Therefore,

$$P(k) = \lambda\, k_m^{\lambda}\, k^{-\lambda-1} \propto k^{-\lambda-1} \quad (k_m < k < \infty). \tag{C1}$$

This equation corresponds to Eq. (25) in Sec. 3.2.3.

Next, we consider the marginal distribution of $l$. Eqs. (A1) and (A2) mean that $l'$ is the sum of two independent random variables, $K'$ and $B'$. Therefore, taking the convolution of the PDFs of $B'$ and $K'$, we have

$$P(l') = \int_{\log(k_m)}^{\infty} \phi(l' - x; \mu_B, \sigma_B)\, \lambda\exp(-\lambda(x - x_m))\, dx \propto \exp(-l'\lambda)\left\{1 - \Phi(\log(k_m); l' - \mu_B - \sigma_B^2\lambda, \sigma_B)\right\}, \tag{C2}$$

where $x_m = \log(k_m)$. Thus, the marginal distribution of $l$ is

$$P(l) \propto l^{-\lambda-1}\left\{1 - \Phi(\log(k_m); \log(l) - \mu_B - \sigma_B^2\lambda, \sigma_B)\right\}. \tag{C3}$$

Because $\Phi(\log(k_m); \log(l) - \mu_B - \sigma_B^2\lambda, \sigma_B) \to 0$ for $l \to \infty$, we have the following asymptotic behavior:

$$P(l) \propto l^{-\lambda-1}. \tag{C4}$$

This is Eq. (26) in Sec. 3.2.3.
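How quickly the marginal distribution of $l$ approaches its power-law tail can be seen from the local log-log slope of the unnormalized expression in Eq. (C3). The following is a minimal numerical sketch with hypothetical parameter values; `log_pdf_l` and `local_slope` are illustrative names.

```python
import math
from statistics import NormalDist


def log_pdf_l(log_l, lam, log_k_min, mu_b, sigma_b):
    """Unnormalized log of P(l) from Eq. (C3), written as a function of log(l)."""
    cutoff = NormalDist(log_l - mu_b - lam * sigma_b ** 2, sigma_b).cdf(log_k_min)
    return -(lam + 1.0) * log_l + math.log(1.0 - cutoff)


def local_slope(log_l, lam, log_k_min, mu_b, sigma_b, h=1e-4):
    """Central-difference estimate of d log P(l) / d log l."""
    upper = log_pdf_l(log_l + h, lam, log_k_min, mu_b, sigma_b)
    lower = log_pdf_l(log_l - h, lam, log_k_min, mu_b, sigma_b)
    return (upper - lower) / (2.0 * h)


# Hypothetical parameters; lam = 1.3 echoes the best parameter set quoted in the text.
lam, log_k_min, mu_b, sigma_b = 1.3, 0.0, 0.0, 0.5
for log_l in (0.0, 1.0, 2.0, 10.0):
    print(log_l, local_slope(log_l, lam, log_k_min, mu_b, sigma_b))
```

Near the cutoff the truncation factor flattens the distribution, while for large $l$ the slope converges to $-\lambda - 1$, in line with Eq. (C4).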

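Similarly, we calculate the marginal distribution of $s$. From Eqs. (A1), (A2), and (A3), we get

$$s' = \nu K' + A' + \alpha B'. \tag{C5}$$

Then, its PDF is obtained by taking the convolution of $A' + \alpha B'$, with the PDF $\phi(x; \mu_A + \alpha\mu_B, \sigma_{s'|k'})$, and $\nu K'$, with the PDF $(\lambda/\nu)\exp(-(\lambda/\nu)(x - \nu x_m))$ for $x > \nu x_m$, where $x_m = \log(k_m)$:

$$P(s') = \int_{\nu\log(k_m)}^{\infty} \phi(s' - x; \mu_A + \alpha\mu_B, \sigma_{s'|k'}) \cdot \frac{\lambda}{\nu}\exp\!\left(-\frac{\lambda}{\nu}(x - \nu x_m)\right) dx \propto \exp\!\left(-\frac{s'\lambda}{\nu}\right) \cdot \left\{1 - \Phi\!\left(\nu\log(k_m); s' - \mu_A - \alpha\mu_B - \frac{\lambda}{\nu}\sigma_{s'|k'}^2, \sigma_{s'|k'}\right)\right\}. \tag{C6}$$

Thus, the marginal distribution of $s$ is

$$P(s) \propto s^{-\frac{\lambda}{\nu}-1} \cdot \left\{1 - \Phi\!\left(\nu\log(k_m); \log(s) - \mu_A - \alpha\mu_B - \frac{\lambda}{\nu}\sigma_{s'|k'}^2, \sigma_{s'|k'}\right)\right\}. \tag{C7}$$

For $s \to \infty$, we get the following asymptotic behavior:

$$P(s) \propto s^{-\frac{\lambda}{\nu}-1} \quad (s \to \infty). \tag{C8}$$

This is Eq. (27) in Sec. 3.2.3.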